Modulation spectrum for pitch and speech pause detection
Abstract
This paper describes a new approach to the speech pause detection problem. The goal is to decide reliably, for a given signal frame, whether speech is present, so that an automatic speech recognizer can be switched on or off. The modulation spectrum is introduced as a method for determining the amount of voicing in a signal frame. This method is tested against two standard pitch detection methods.
Similar resources
Speech Modulation Features for Robust Nonnative Speech Accent Detection
In this paper, we propose to use speech modulation features for robust nonnative accent detection. Modulation spectrum carries long term temporal information of speech and may discriminate accents of native and nonnative speakers. For each speech segment to be tested, we extract a 10 dimension feature vector from modulation spectrum and use it for model training and testing. The proposed modula...
The Prosody of Discourse Structure and Content in the Production of Persian EFL Learners
The present research addressed the prosodic realization of global and local text structure and content in the spoken discourse data produced by Persian EFL learners. Two newspaper articles were analyzed using Rhetorical Structure Theory. Based on these analyses, the global structure in terms of hierarchical level, the local structure in terms of the relative importance of text segments and the ...
Automatic Detection of Brazil’s Prosodic Tone Unit
This research is focused on the automatic detection of one of the fundamental elements of Brazil’s prosody model, the tone unit. We compared the performance of using silent pause duration alone to delimit tone units and using relative pitch resets and slow pace (or post-boundary lengthening) along with silent pause duration to delimit them. The corpus used for the comparison is composed of 18 h...
Combined Use of Speaker- and Tone-Normalized Pitch Reset with Pause Duration for Automatic Story Segmentation in Mandarin Broadcast News
This paper investigates the combined use of pause duration and pitch reset for automatic story segmentation in Mandarin broadcast news. Analysis shows that story boundaries cannot be clearly discriminated from utterance boundaries by speaker-normalized pitch reset due to its large variations across different syllable tone pairs. Instead, speaker- and tone-normalized pitch reset can provide a clear...
Early Prosodic Manifestations of Disfluency
Theoretical models of speech production have hypothesized a relation between different types of disfluencies and the mechanisms responsible for them. Some disfluencies, such as filled pauses (e.g. ‘um’, ‘uh’) and repetitions (i.e. ‘the the’), are argued to arise from difficulty in planning, while cutoff disfluencies (e.g. ‘horiz-[ontal]’) are argued to arise from self-monitoring. This distinctio...
Publication date: 2003